Identification of an mRNA isoform switch for HNRNPA1 in breast cancers

Roles of HNRNPA1 are beginning to emerge in cancers; however, mechanisms causing deregulation of HNRNPA1 function remain elusive. Here, we describe an isoform switch between the 3′-UTR isoforms of HNRNPA1 in breast cancers. We show that the dominantly expressed isoform in mammary tissue has a short half-life. In breast cancers, this isoform is downregulated in favor of a stable isoform. The stable isoform is expressed more in breast cancers, and more HNRNPA1 protein is synthesized from this isoform. High HNRNPA1 protein levels correlate with poor survival in patients. In support of this, silencing of HNRNPA1 causes a reversal in neoplastic phenotypes, including proliferation, clonogenic potential, migration, and invasion. In addition, silencing of HNRNPA1 results in the downregulation of microRNAs that map to intragenic regions. Among these miRNAs, miR-21 is known for its transcriptional upregulation in breast and numerous other cancers. Altogether, the cancer-specific isoform switch we describe here for HNRNPA1 emphasizes the need to study gene expression at the isoform level in cancers to identify novel cases of oncogene activation.


Results
Isoform level analysis. In a targeted screen for HNRNPA1 expression in breast cancers, we re-analyzed the GSE31519, GSE2034, and GSE7390 datasets using APADetect, an algorithm that detects isoform-level expression differences based on differential poly(A) site usage 16,17. We analyzed the datasets for probe-level differences based on the positions of poly(A) sites. Probes proximal to poly(A) sites generally recognize all isoforms, whereas distal probes recognize longer isoforms. Ratios of proximal to distal probe sets were calculated for normal and cancer samples. Significant changes in the signal intensities were reported as SLR ((Short + Long)/Long ratio). In breast cancer patients (n = 856), independent of tumor type, we observed significant downregulation (p < 0.0001) of an HNRNPA1 isoform that ends with a distal poly(A) site on the gene locus (Hs.546261.1.27) (Fig. 1A). The downregulated isoform (hereafter called Isoform-1) has a different 3′UTR than other isoforms of HNRNPA1 due to the inclusion of two non-coding terminal exons (exons 12 and 13). Only the distal probes of 214280_x_at (Affymetrix probe set ID) recognize Isoform-1 specifically.
On the contrary, the SLR of other isoforms was high in patients compared to normal breast tissue (Fig. 1B). Isoform-2, -3, and -4 were co-detected (but could not be distinguished) and quantified by the distal probes of the 200016_x_at probe set, as these isoforms have the same terminal exon (exon 11). These isoforms have either a shorter 3′UTR (Isoform-2) or a longer 3′UTR (Isoform-3 and -4). Overall, breast cancer patients have an increased isoform ratio ((Short + Long isoforms)/Long isoforms) for HNRNPA1. The ratio shift was significant; however, we could not determine individual expression levels because both proximal and distal probes recognize multiple isoforms. Interestingly, increased expression of isoforms detected by proximal probes of 200016_x_at correlates with patient relapse times in the GSE31519 cohort (Fig. 1C). These data suggested that isoforms are differentially expressed in patients.
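The probe-level ratio described above can be illustrated with a minimal sketch. It assumes that SLR for one sample is simply the mean proximal-probe intensity divided by the mean distal-probe intensity, which matches the (Short + Long)/Long definition because proximal probes detect all isoforms and distal probes detect only the longer ones. This is not the APADetect implementation; the function name `slr` and all intensity values are hypothetical.

```python
from statistics import mean

def slr(proximal, distal):
    """(Short + Long)/Long ratio for one sample (illustrative assumption).

    proximal: intensities of probes that detect short and long isoforms
    distal:   intensities of probes that detect long isoforms only
    """
    distal_mean = mean(distal)
    if distal_mean <= 0:
        raise ValueError("distal probe signal must be positive")
    return mean(proximal) / distal_mean

# Hypothetical intensities, for illustration only (not data from the study).
normal_sample = slr(proximal=[100.0, 120.0, 110.0], distal=[100.0, 120.0])
tumor_sample = slr(proximal=[150.0, 170.0, 160.0], distal=[80.0, 80.0])
```

With these made-up values, the tumor sample has a higher SLR than the normal sample, the direction of change reported for the isoforms detected by 200016_x_at.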
To begin confirming the in silico patient data, we first validated the 3′-ends of isoforms by 3′-RACE, cloning, and sequencing (Supplementary Figs. S1-S3). Next, we tested breast cancer cell lines (n = 18) and a panel of breast cancer patient cDNAs (n = 25) by RT-qPCR (Supplementary Fig. S4). We detected an increased ratio of isoforms compared to Isoform-1 in 40% of breast cancer cell lines and approximately 90% of patient samples. However, a limitation of both the microarray data and RT-qPCR was the use of probes and primers recognizing more than one isoform. To delineate isoform-specific expression, we turned to RNA-seq data from the Genotype-Tissue Expression (GTEx) and TCGA datasets. We compared the expression of isoforms in GTEx normal tissue samples to TCGA tumor samples using UCSC Xena, which allows comparison of the two datasets [18][19][20].
The isoform switch was also evident in single-cell RNA-sequencing data of normal mammary cells compared with breast cancer cells (GSE113197, GSE75688) (Fig. 2F). With these results, the reason behind the SLR change in breast cancers (Fig. 1B) became apparent. The higher SLR was due to the upregulation of Isoform-2 and downregulation of other isoforms.
Isoform switch and HNRNPA1 protein levels. At this point, we wanted to investigate the functional consequences of the isoform switch. However, we were surprised to find that Isoform-1 is annotated as a non-coding transcript (NR_135167). The coding sequence of HNRNPA1 ends in exon 10, and all isoforms share the same stop site despite having different terminal exons (Supplementary Fig. S5). Hence, to find out whether Isoform-1 codes for a peptide, we first calculated the coding potential using the Coding Potential Calculator 2 (CPC2) algorithm 22 and saw that it was similar to other isoforms (Fig. 3A). To verify the coding potential experimentally, we performed ribosomal affinity purification (TRAP) of translated mRNAs, performed RT-qPCR, and normalized the polysome-bound transcript levels to no-TRAP control cells. These results showed that all isoforms were associated with the immunoprecipitated polysomes. XIST non-coding RNA was used as a negative control (Fig. 3B).
Next, we cloned the coding sequence of HNRNPA1 along with the different 3′UTRs of the isoforms. Isoform-1 has a unique 3′UTR (Iso-1-3′UTR). Other isoforms share the same terminal exons, but Isoform-2 has a shorter 3′UTR (S-3′UTR) compared to the long 3′UTRs (L-3′UTR) of Isoform-3 and Isoform-4. Therefore, we tested whether these three types of 3′UTRs had different effects on protein levels. We transfected MCF7 and MDA-MB-231 cells with the HA-tagged HNRNPA1 protein expression constructs. Protein expression was detected by western blotting. Of interest, the level of HNRNPA1 protein encoded by the construct with Iso-1-3′UTR was markedly lower than the other isoforms (Fig. 3C). Because transfection efficiency could be a reason for this observation, we cloned the three different 3′UTRs downstream of a reporter gene and transiently transfected cells for a dual luciferase assay where transfection efficiencies were normalized. Here too, the luciferase reporters for the S-3′UTR and L-3′UTR had significantly higher activities than Iso-1-3′UTR in both cells (Fig. 3D). Results from the 3′UTR-reporter system, along with forced expression of HA-tagged proteins, suggested that expression of Isoform-1 correlated with lower protein levels. Since we did not see a difference in ribosome association of isoforms (Fig. 3A), we tested whether mRNA half-lives could affect HNRNPA1 protein abundance. We tested mRNA levels of HNRNPA1 isoforms following actinomycin D treatment for 12 h to prevent new transcription. RT-qPCR results showed that Isoform-1 had a short half-life, comparable to MYC mRNA, well-known for its short half-life 23 (Fig. 4A). In contrast, other isoforms were still stable after 12 h when we finalized the experiment in MCF7 and MDA-MB-231 cells. Similar decay rates were determined in MCF10A cells (non-tumorigenic mammary epithelial) (Supplementary Fig. S6).
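As an illustration of how a half-life can be read from an actinomycin D time course, the sketch below assumes first-order decay, N(t) = N0·exp(−kt), so that t½ = ln 2 / k. The text does not state how half-lives were computed, so the log-linear least-squares fit, the function name `half_life`, and the time points are assumptions made for illustration only.

```python
import math

def half_life(times_h, fraction_remaining):
    """Estimate mRNA half-life (hours) assuming first-order decay.

    Fits ln(fraction) = intercept - k * t by ordinary least squares
    and returns ln(2) / k.
    """
    logs = [math.log(f) for f in fraction_remaining]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    k = -sxy / sxx  # decay constant per hour
    if k <= 0:
        return math.inf  # no measurable decay over the time course
    return math.log(2) / k

# Hypothetical time course: a transcript whose level halves every 2 h.
times = [0, 2, 4, 8, 12]
unstable = [0.5 ** (t / 2) for t in times]
t_half = half_life(times, unstable)

# Hypothetical stable transcript: level unchanged over 12 h.
stable_t_half = half_life(times, [1.0] * len(times))
```

Under this model, a transcript that shows no decline over the 12 h window returns an infinite half-life, which is one way to represent "still stable" transcripts.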
We also treated cells with actinomycin D and cycloheximide, an inhibitor of ribosomal elongation 24. Interestingly, cycloheximide treatment for only 3 h had a dramatic recovery effect, and only for Isoform-1 (> 3.5 fold in MCF7, > 7.5 fold in MDA-MB-231) (Fig. 4B, Supplementary Fig. S7). This quick recovery suggests that Isoform-1 is co-translationally degraded, as cycloheximide is also known to inhibit mRNA decay 25.
These results indicated that the isoform switch results in differential expression of isoforms with different mRNA stabilities, affecting protein levels. However, we also tested whether 3′UTRs may regulate the localization of HNRNPA1 protein, as was suggested for a few interesting cases 26,27. In this case, the nuclear localization of HNRNPA1 was independent of the 3′UTR sequences of isoforms (Supplementary Fig. S8).
Overall, these results showed that Isoform-1 is a rapidly degraded mRNA, possibly better regulating the protein level of HNRNPA1. This unstable isoform is low in breast cancers, whereas Isoform-2 is upregulated. Because this switch would indicate upregulation of HNRNPA1 protein, we were curious to investigate HNRNPA1 protein levels in patient samples. Hence, we took advantage of a quantitative liquid chromatography/mass spectrometry-based proteome analysis dataset, which used protein extracts from breast tumors and adjacent noncancerous tissues 28. In this dataset, HNRNPA1 protein was significantly higher in 52 tumors compared to normal tissues and in 13 basal-like tumors compared with normal tissue (Fig. 5A). High HNRNPA1 protein levels in these patients correlated with decreased survival, strengthening the significance of the oncogenic role of HNRNPA1 (Fig. 5B). Moreover, in an independent dataset of the Clinical Proteomic Tumor Analysis Consortium (CPTAC), HNRNPA1 protein was also overexpressed in luminal, HER2+, and TNBC tumors (Fig. 5C). Of note, posttranslational modifications and protein-protein interactions are likely to introduce additional layers of regulation to HNRNPA1 activity in cells.
HNRNPA1 silenced models and intragenic miRNAs. Next, to begin addressing the biological relevance of HNRNPA1 overexpression in breast cancers, we generated stable shRNA constructs to target HNRNPA1 expression in MCF7 and MDA-MB-231 cells (Fig. 5D) and tested these models for changes in their neoplastic phenotypes. Consistent with HNRNPA1 being a versatile RNA-binding protein involved in many aspects of RNA biology, we found a significant reversal of neoplastic phenotypes in both silencing models. We observed loss of clonogenicity (Fig. 5E), decreased proliferation (Fig. 5F), decreased motility (Fig. 5G), and decreased migration and invasion capability (Fig. 5H) upon sustained silencing of HNRNPA1.
Next, to shed light on the possible effects of HNRNPA1 activity in breast cancers, we turned to microRNAs (miRNAs) as a less explored aspect of HNRNPA1 function. HNRNPA1 has been implicated in promoting or hindering the processing steps of pri-miR-18a and pri-let-7a-1 by direct binding to loop regions 29,30. Because HNRNPA1 has fundamental roles in RNA biology, we wanted to test whether other miRNAs would be affected by HNRNPA1 silencing. We used HNRNPA1 silenced MCF7 and MDA-MB-231 cells (Fig. 6A,D) to screen approximately 800 miRNAs using the NanoString technology (nCounter Human miRNA assay). We detected […]
Figure 3 (legend, continued). XIST, a non-coding RNA, was used as a negative control (**p < 0.01; n = 3, Student's t-test). (C) Cells were transiently transfected with indicated vectors, and lysates were collected. HA antibody was used to detect HNRNPA1 levels. Same blots were hybridized with ACTB antibody. The image is representative of 3 independent experiments. Graphs show densitometric quantification of bands (*p < 0.05, **p < 0.01, one-way ANOVA, Tukey's HSD); uncropped images are presented in Supplementary Fig. S13. (D) Different 3′UTRs were cloned downstream of the luciferase gene in the pMIR vector. Cells were transiently transfected, and Firefly/Renilla luciferase readouts from the constructs were normalized to that of empty pMIR (*p < 0.05, **p < 0.01, ****p < 0.0001; n = 3 independent transfections, one-way ANOVA, Tukey's HSD).
Notably, most downregulated miRNAs in both models were intragenic and mapped to introns of host genes (Fig. 6B,E). We reasoned that decreased expression of host genes might explain the downregulation of these miRNAs. Indeed, low miRNA read counts correlated with downregulated mature miR-27b-3p and pri-miR-27b-3p levels along with its host gene C9ORF3 on 9q22.32 in MDA-MB-231 cells upon HNRNPA1 silencing (Fig. 7A). These results suggested that the downregulation of miR-27b was due to decreased transcription of the host gene.
Figure 4 (legend, continued). RPLP0 as a stable mRNA and MYC as an unstable mRNA were used as controls (n = 3 independent treatments, ****p < 0.0001; n = 3, Student's t-test). (B) Cells were treated with actinomycin D and/or cycloheximide (CHX, 100 µg/mL) for 3 h to prevent transcription and translation. EtOH (Ethanol) and DMSO are carrier controls. Cells were collected and RNA was isolated for RT-qPCR (*p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001, ns: not significant, n = 3 independent experiments, Student's t-test).
www.nature.com/scientificreports/
Next, miR-21 caught our attention as one of the most abundantly expressed and studied miRNAs in breast and other cancers 31,32. RT-qPCR verified low read counts for miR-21. Pri-miR-21 levels were also low in MCF7 cells upon HNRNPA1 silencing (Fig. 7B). The miR-21 gene resides within intron 11 of VMP1 (Vacuole Membrane Protein-1) on 17q23.2. Hence, we tested whether VMP1 was also downregulated in HNRNPA1 silenced cells to explain the mechanism behind decreased pri-miR-21 levels. However, there was only a minimal decrease in VMP1 mRNA, which was unlikely to explain low levels of pri-miR-21 in HNRNPA1 silenced cells (Fig. 7B). Notably, VMP1 mRNA is not the only source for miR-21 biogenesis; additional miR-21 promoters and primary transcripts have been characterized from within the terminal intronic regions of VMP1 33. To test whether the activity of this promoter region 34 was different in HNRNPA1 silenced cells, we cloned the well-defined promoter region for miR-21, a 433 bp region from −3770 to −3337 relative to the hairpin, into the pGL3-Basic promoter vector, driving Firefly luciferase expression. Transfection efficiency was monitored with phRL-TK driving the expression of the Renilla luciferase. Control cells (NT-sh) and HNRNPA1 silenced cells were transiently transfected with both vectors.
We observed that the luciferase enzyme activity from the pGL3-miR-21 promoter was approximately 70% lower in HNRNPA1 silenced cells than in the control cells (NT) (Fig. 7C). These findings collectively show that miR-21 and pri-miR-21 levels were downregulated mainly due to the decreased activity of the miR-21 promoter in HNRNPA1 silenced cells.
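The dual-luciferase normalization behind such a comparison can be sketched as follows: the Firefly signal is divided by the co-transfected Renilla signal to correct for transfection efficiency, and the resulting ratio is expressed relative to the control cells. The function name `relative_luciferase` and the luminometer readings are hypothetical; the readings were chosen only so that the sketch yields a decrease of about 70%.

```python
def relative_luciferase(firefly, renilla, firefly_ctrl, renilla_ctrl):
    """Firefly/Renilla ratio of a sample, relative to the control ratio."""
    if renilla <= 0 or renilla_ctrl <= 0 or firefly_ctrl <= 0:
        raise ValueError("normalizing signals must be positive")
    sample_ratio = firefly / renilla
    control_ratio = firefly_ctrl / renilla_ctrl
    return sample_ratio / control_ratio

# Hypothetical readings, for illustration only (not data from the study).
control = {"firefly": 9000.0, "renilla": 3000.0}   # NT-sh cells
silenced = {"firefly": 2700.0, "renilla": 3000.0}  # HNRNPA1 silenced cells

rel = relative_luciferase(
    silenced["firefly"], silenced["renilla"],
    control["firefly"], control["renilla"],
)
percent_decrease = (1.0 - rel) * 100.0
```

Normalizing to Renilla first means that a well with fewer transfected cells is not mistaken for one with lower promoter activity.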
To test the functionality of miR-21 downregulation on potential targets, we chose to generate a miR-21 sensor rather than testing known mRNA targets because HNRNPA1 loss is likely to alter levels/functions of many other coding and non-coding genes. Hence, we cloned two complementary binding sites for miR-21 downstream of the Firefly luciferase CDS. We transfected control and HNRNPA1 silenced cells with this sensor. As a result, we detected higher luciferase activity from the miR-21 sensor in HNRNPA1 silenced cells due to less miR-21 binding to the 3′UTR of the luciferase mRNA (Fig. 7D). A mutant construct lacking the seed sequences of miR-21 had similar luciferase activities in control and HNRNPA1 silenced cells, showing the specificity of the sensor (Supplementary Fig. S10). miR-21 is upregulated in breast cancers, and this upregulation impacts the overall survival of ER+ breast cancers (Supplementary Fig. 10, Fig. 7E). While the effect of HNRNPA1 on miR-21 transcription is possibly indirect, an HNRNPA1 guided network may hold the potential to decipher transcriptional deregulation of miRNAs implicated in cancers.
Finally, we also took an independent approach and targeted the HNRNPA1 gene locus with CRISPR/Cas9. We confirmed decreased expression levels of miRNAs in HNRNPA1 deleted cells (Supplementary Fig. S11); however, the cells were not viable for continued culturing, showing HNRNPA1 dependency of cells. Indeed, most breast cancer cell lines have low "gene effect scores", indicating a high likelihood that HNRNPA1 is an essential gene in depletion assays (Supplementary Fig. S12). Our data and dependency scores collectively suggest that HNRNPA1 function is critical and that cells cannot rescue its loss. Notably, because disease mutations have been reported for HNRNPA1 along with mutations in HNRNPA2B1 35, we also looked into expression patterns of the two transcripts. We found no significant correlation in more than a thousand breast cancer patient samples (Supplementary Fig. S12).
Overall, we report an isoform switch for HNRNPA1 and provide insight into the oncogenic roles of HNRNPA1 as a versatile RNA-binding protein whose expression is critical for the neoplastic phenotypes of breast cancer cells. Our findings, specifically on miR-21, may help understand how oncogenic miRNAs are frequently elevated in cancers.

Discussion
Mechanisms leading to alternative processing of mRNAs are gaining more attention as we begin to appreciate the complexities of cancer transcriptomes 36,37. Accordingly, widespread expression of alternatively spliced or polyadenylated isoforms has been described in cancers 9,14,38,39. As part of this complexity, cancer-specific isoform switches change the ratio of mRNA isoforms that may differ in their CDSs or 3′UTR sequences, consequently modulating protein functions in cancer cells. Hence, an increased appreciation of isoform switches may help the discovery of overlooked cancer-related genes and provide new avenues for diagnostic and therapeutic applications.
HNRNPA1 is a versatile protein involved in diverse aspects of RNA biology, including mRNA trafficking, telomere maintenance, regulation of mRNA stability, and splicing by antagonizing or enhancing other splicing proteins. HNRNPA1 can also bind to AU-rich elements and UAGGGA(U) motifs in the 3′UTRs, and possesses RNA chaperone activity, promoting RNA-RNA interactions. In addition, HNRNPA1 has been implicated in transcriptional activation by binding to and destabilizing G-quadruplex structures within promoters [40][41][42]. Hence, deregulation of HNRNPA1 abundance may have diverse and indirect consequences.
Our work here demonstrates an isoform switch for HNRNPA1 in breast cancers. HNRNPA1 has four similar mRNA isoforms in normal breast tissue as described in the GTEx database. All isoforms mainly differ at their 3′UTRs. Our integrated in silico approach combining isoform level analysis of microarrays, RNA-seq, and single-cell RNA-seq data allowed the discovery and confirmation of the isoform switch. The microarray data clearly showed downregulation of Isoform-1 because distal probes only recognize this isoform. Interestingly, the ratio of other isoforms was high, and the increased expression of these isoform(s) correlated with patient survival. However, it was unclear which isoform was increased because the probes recognized more than one isoform in the microarray data. To this end, the use of GTEx and TCGA datasets revealed isoform-specific expression patterns in breast cancers. Isoform-1, the dominant transcript in mammary tissue, was downregulated in all PAM50 groups compared to GTEx normal tissues. Other minor isoforms (Isoform-3 and 4) were also lower in tumors than adjacent normal or GTEx normal tissue. In contrast, Isoform-2 was the only isoform that was upregulated in breast cancers. This pattern suggested proximal polyadenylation to favor Isoform-2, which has the most proximal poly(A) site, over Isoform-1 and other isoforms with distal poly(A) sites. We wanted to understand the consequence of this switch, and we found that the dominant isoform (Isoform-1) in breast tissue has a unique […]
In support of an oncogenic role, depletion of HNRNPA1 had a significant effect on neoplastic phenotypes in RNAi silenced cell models. The decreased neoplastic phenotypes in vitro were substantial in the RNAi models. Of note, our CRISPR/Cas9 knockout models did not survive. This observation is in agreement with cell dependency scores listed in DepMap and canSAR datasets.
Hence, these data suggested that HNRNPA1 function is critical and possibly not recovered by other members of the HNRNPs. Given all the diverse roles of HNRNPA1, we sought to provide additional insight into HNRNPA1 function in breast cancers. A high throughput miRNA expression assay showed a global downregulation of mature miRNA levels. Further analyses showed that the majority of these miRNAs were located within host genes. Among these, we showed miR-21 promoter activity was decreased, and pri-miR-21 levels were downregulated. While the effect on global downregulation of miRNAs is possibly an indirect consequence of HNRNPA1 loss, it will be essential to delineate the HNRNPA1 downstream players responsible for the transcription of pri-miRNAs listed here. Considering these results and the known roles of HNRNPA1 in RNA metabolism 41, upregulation of HNRNPA1 through the isoform switch may significantly affect different transcriptome components. Of note, while Isoform-2 is upregulated, other isoforms still contribute to HNRNPA1 protein synthesis. The switch enhances protein overexpression but hinders detection of overexpression at the transcript level.
Overall, our data emphasize that focusing on isoform level changes is essential to decoding the cancer transcriptome in higher resolution. This perspective may allow the identification of new oncogene activation cases where overall mRNA levels may not change significantly or common driver mutations do not exist at the genome level. In addition, isoform-specific expression data could also be critical to study isoform-specific post-translational modifications of proteins. Therefore, looking for isoform switches in cancer transcriptomes is a promising strategy to discover new cancer genes with biological impact. The isoform switch we describe for HNRNPA1 has implications in breast cancer and possibly other malignancies.
NanoString nCounter miRNA assay. NanoString nCounter Human miRNA V3 was performed according to the manufacturer's instructions (NanoString Technologies) at CanSyL/M.E.T.U. Significance was calculated using an unpaired t-test for the three technical replicates. Significant expression changes were listed based on fold changes (< 0.6 and > 1.5) and p-values (p < 0.05). Heatmaps were drawn using GraphPad Prism 8.0.2. Biological pathways affected by miRNAs were determined by using DIANA TOOLS mirPath v.3 57.
Survival analysis. Expression of isoforms, determined by the proximal probes of 200016_x_at, was used to group patients in the GSE31519 dataset. Patients were grouped according to top 25% (High, n = 90) and bottom 25% (Low, n = 90) expressers. The survival graph for HNRNPA1 protein was from the cohort described in Tang et al. 28. Hazard ratio (HR) with 95% confidence intervals and log-rank p-value were calculated using Kaplan-Meier Plotter 58.
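The fold-change and p-value thresholds above amount to a simple filter, sketched below. The sketch assumes that fold changes (silenced/control) and unpaired t-test p-values have already been computed per miRNA; the function name `is_significant_change` and the miRNA labels and values are hypothetical placeholders.

```python
def is_significant_change(fold_change, p_value, down=0.6, up=1.5, alpha=0.05):
    """True if a miRNA passes both the fold-change and p-value cutoffs."""
    passes_fold = fold_change < down or fold_change > up
    return passes_fold and p_value < alpha

# Hypothetical (fold change, p-value) pairs, for illustration only.
results = {
    "miR-A": (0.4, 0.01),   # downregulated and significant
    "miR-B": (0.9, 0.001),  # significant p-value, but fold change too small
    "miR-C": (1.8, 0.20),   # large fold change, but not significant
    "miR-D": (2.1, 0.03),   # upregulated and significant
}

hits = sorted(
    name for name, (fc, p) in results.items() if is_significant_change(fc, p)
)
```

Requiring both criteria excludes miRNAs with a small but reproducible shift as well as those with a large but noisy one.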
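The quartile-based grouping can be sketched as below. How ties and rounding at the cutoffs were handled is not stated, so the sketch assumes a simple rank-based split that takes the floor of 25% of the cohort for each group; the function name `split_high_low` and the patient identifiers and expression values are hypothetical.

```python
def split_high_low(expression_by_patient, fraction=0.25):
    """Return (high, low) patient IDs: top and bottom `fraction` by expression."""
    ranked = sorted(expression_by_patient, key=expression_by_patient.get)
    k = int(len(ranked) * fraction)  # group size, rounded down
    low = ranked[:k]
    high = ranked[len(ranked) - k:]
    return high, low

# Hypothetical cohort of eight patients, for illustration only.
expression = {f"P{i}": float(i) for i in range(1, 9)}
high, low = split_high_low(expression)
```

The two groups would then be passed to a survival tool for the hazard ratio and log-rank test; patients in the middle 50% are left out of the comparison.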